A common tool used by paleontologists to understand the physiology, ecology and evolution of fossil mammals (and also fossil vertebrates) is reconstructing body size. In mammals in particular, this can be done using dimensions of the appendicular skeletal elements and dental morphology. For rodents in particular this most commonly is seen using the first lower molar (m1), but in this paper a proxy for body mass is developed using toothrow length and area for the dentary (lower jaw).
The main objective of this publication was to estimate body mass for extinct rodent taxa from dental toothrow length and area and determine the effectiveness of these proxies compared with previously published estimations of body mass from femur diameter.
This study showed that tooth dimensions are accurate for estimating body mass when looking at interspecific comparisons, and even omre accurate proxies when looking at narrow subclades within Rodentia that are more closely related to a fossil taxa of interest.
In the following code I will replicate the regression analyses for establishing both lower toothrow length and rectangular toothrow area as proxies for estimating body mass. The fossil taxa utilized in this study did not have dental measurements provided so could not be included in this replication assignment, but summary statistics and visualizations of the proxy data can be found below.
Libraries Used
library(tidyverse)
library(dplyr)
library(tidyr)
library(cowplot)
library(broom)
library(sjPlot)
library(sjmisc)
library(sjlabelled)
library(sjstats)
library(jtools)
Loading in Dataset
rodents <- "https://raw.githubusercontent.com/allyboville3/Data-Replication-Assignment/main/Hopkins%202008%20-%20Appendix1.csv"
rodent_df <- read_csv(rodents, col_names = TRUE)
head(rodent_df)
## # A tibble: 6 × 7
## Species `MVZ specimen` `mass (g)` `LTRL (mm)` `M1 width (mm)` subclade
## <chr> <dbl> <chr> <dbl> <dbl> <chr>
## 1 Aplodontia rufa 172402 1465 17.5 4.13 Aplodon…
## 2 Aplodontia rufa 22624 1375 18.2 3.83 Aplodon…
## 3 Aplodontia rufa 22626 1350 17.9 3.84 Aplodon…
## 4 Aplodontia rufa 22627 1048.2 17.5 3.61 Aplodon…
## 5 Aplodontia rufa 116278 1069 17.6 3.35 Aplodon…
## 6 Aplodontia rufa 96062 832 17.1 4.01 Aplodon…
## # ℹ 1 more variable: `pub. avg. mass (g)` <chr>
RTRA = (LTRL X M1 width)
With the acknowledgement that we are making the assumption that rodent cheek teeth are essentially rectangular in shape (and that this may result in an overestimation of body mass), area was calculated by multiplying the length of the toothrow by the width of the first lower molar. The first is most often the largest while the first and second molar are the widest. M1 width in this data refers to either the first or second molar width which can be assumed to be the same and representative of the maximum width of the toothrow.
rodent_with_RTRA <- rodent_df %>%
mutate(RTRA = (`LTRL (mm)` * `M1 width (mm)`))
rodent_with_RTRA
## # A tibble: 425 × 8
## Species `MVZ specimen` `mass (g)` `LTRL (mm)` `M1 width (mm)` subclade
## <chr> <dbl> <chr> <dbl> <dbl> <chr>
## 1 Aplodontia ru… 172402 1465 17.5 4.13 Aplodon…
## 2 Aplodontia ru… 22624 1375 18.2 3.83 Aplodon…
## 3 Aplodontia ru… 22626 1350 17.9 3.84 Aplodon…
## 4 Aplodontia ru… 22627 1048.2 17.5 3.61 Aplodon…
## 5 Aplodontia ru… 116278 1069 17.6 3.35 Aplodon…
## 6 Aplodontia ru… 96062 832 17.1 4.01 Aplodon…
## 7 Aplodontia ru… 114341 926 16.4 3.99 Aplodon…
## 8 Aplodontia ru… 96058 760.7 17.2 3.82 Aplodon…
## 9 Aplodontia ru… 96063 820.9 16.3 3.68 Aplodon…
## 10 Aplodontia ru… 115540 741 16.8 3.76 Aplodon…
## # ℹ 415 more rows
## # ℹ 2 more variables: `pub. avg. mass (g)` <chr>, RTRA <dbl>
rodent_avg <- rodent_with_RTRA %>%
mutate(Mass = as.numeric(`mass (g)`), Pub_avg_mass = as.numeric(`pub. avg. mass (g)`), RTRA = as.numeric(`RTRA`)) %>% #change data for Mass, Published Average Mass, and Retangular Toothrow Area from type chr to dbl
group_by(Species) %>% #grouping data by Species
tidyr::fill(Pub_avg_mass, .direction ="down") %>% #filling in NA's in published averaged masses to make the following code happy
#"The average body mass reported by Ernest (2003) was used for 3 species (Hydrochoerus hydrochaeris, Graphiurus murinus, and Cryptomys hottentotus) for which museum specimens lacked data on body mass." - Copied from Hopkins 2008
mutate(Mass = coalesce(Mass,Pub_avg_mass)) %>% #Sub NA values for Mass with Previously published mass averages
summarize(
Mass = mean(`Mass`), #averages mass for each species
LTRL = mean(`LTRL (mm)`), #averages toothrow length for each species
`Toothrow_Width` = mean(`M1 width (mm)`), #averages toothrow width (based off of M1 width) for each species
RTRA = mean(RTRA)
) %>%
mutate(across(where(is.numeric), ~ round(., 2))) # rounding data to only have two decimal places
rodent_avg
## # A tibble: 75 × 5
## Species Mass LTRL Toothrow_Width RTRA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Aethomys hindei 98.4 6.29 1.84 11.6
## 2 Akodon subfuscus 18.8 4.08 1 4.09
## 3 Ammospermophilus leucurus 110. 6.31 1.93 12.2
## 4 Aplodontia rufa 1128. 17.4 3.79 66.0
## 5 Apodemus semotus 27 4.32 1.22 5.25
## 6 Arborimus pomo 27.1 5.81 1.29 7.55
## 7 Arvicanthus abysinnicus 82.2 6.24 1.77 11.0
## 8 Berylomys bowersi 426. 9.11 2.49 22.7
## 9 Castor canadensis 16060 32.7 7.81 256.
## 10 Clethrionomys gappperi 24.1 5.06 1.01 5.09
## # ℹ 65 more rows
interspecies_lm <- lm(log(Mass) ~ log(LTRL), data = rodent_avg) #linear regression model
s <- summary(interspecies_lm)
tidy(s, conf.int = TRUE) #TEST to take a look at confidence intervals
## # A tibble: 2 × 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -0.709 0.172 -4.12 9.81e- 5 -1.05 -0.366
## 2 log(LTRL) 2.79 0.0829 33.6 3.58e-46 2.62 2.95
tidy(interspecies_lm) #tables showing summary statistsics for this lm
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -0.709 0.172 -4.12 9.81e- 5
## 2 log(LTRL) 2.79 0.0829 33.6 3.58e-46
glance(interspecies_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.939 0.939 0.426 1131. 3.58e-46 1 -41.5 88.9 95.9
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_All <- ggplot(data = rodent_avg, aes(x = log(LTRL), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,5) + #matching dimensions of plot to figure from publication
ylim(0,12) +
theme_classic(base_size = 16)
p_All
# 2. Species Less than 5kg Essentially is removing the four largest
rodent species. This was done because for large rodents there may be
different factors influencing dentition shape and size compared to
smaller rodents. Including these larger taxa may result in the
regression being misleading.
#removing four largest rodent species from data
rodent_no_outlier <- rodent_avg %>%
filter(!Species == "Erithizon dorsatum", !Species == "Castor canadensis", !Species == "Hydrochoeris hydrochoeris", !Species == "Cuniculus paca")
rodent_no_outlier
## # A tibble: 71 × 5
## Species Mass LTRL Toothrow_Width RTRA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Aethomys hindei 98.4 6.29 1.84 11.6
## 2 Akodon subfuscus 18.8 4.08 1 4.09
## 3 Ammospermophilus leucurus 110. 6.31 1.93 12.2
## 4 Aplodontia rufa 1128. 17.4 3.79 66.0
## 5 Apodemus semotus 27 4.32 1.22 5.25
## 6 Arborimus pomo 27.1 5.81 1.29 7.55
## 7 Arvicanthus abysinnicus 82.2 6.24 1.77 11.0
## 8 Berylomys bowersi 426. 9.11 2.49 22.7
## 9 Clethrionomys gappperi 24.1 5.06 1.01 5.09
## 10 Clethrionomys rutilus 27 5.4 1.09 5.89
## # ℹ 61 more rows
interspecies_small5kg_lm <- lm(log(Mass) ~ log(LTRL), data = rodent_no_outlier)
tidy(interspecies_small5kg_lm)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -0.615 0.221 -2.79 6.86e- 3
## 2 log(LTRL) 2.73 0.113 24.1 2.58e-35
glance(interspecies_small5kg_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.894 0.892 0.428 580. 2.58e-35 1 -39.4 84.8 91.6
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_no_outlier <- ggplot(data = rodent_no_outlier, aes(x = log(LTRL), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,4) +
ylim(0,9) +
theme_classic(base_size = 16)
p_no_outlier
# 3. Species less than 500g Filtering data to only all species with mass
<500g
rodent_smol <- rodent_avg %>%
filter(Mass < 500)
rodent_smol
## # A tibble: 61 × 5
## Species Mass LTRL Toothrow_Width RTRA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Aethomys hindei 98.4 6.29 1.84 11.6
## 2 Akodon subfuscus 18.8 4.08 1 4.09
## 3 Ammospermophilus leucurus 110. 6.31 1.93 12.2
## 4 Apodemus semotus 27 4.32 1.22 5.25
## 5 Arborimus pomo 27.1 5.81 1.29 7.55
## 6 Arvicanthus abysinnicus 82.2 6.24 1.77 11.0
## 7 Berylomys bowersi 426. 9.11 2.49 22.7
## 8 Clethrionomys gappperi 24.1 5.06 1.01 5.09
## 9 Clethrionomys rutilus 27 5.4 1.09 5.89
## 10 Cryptomys hottentotus 132. 6.08 1.89 11.5
## # ℹ 51 more rows
interspecies_smol_lm <- lm(log(Mass) ~ log(LTRL), data = rodent_smol)
tidy(interspecies_smol_lm)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -0.290 0.284 -1.02 3.11e- 1
## 2 log(LTRL) 2.53 0.157 16.1 2.54e-23
glance(interspecies_smol_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.815 0.812 0.434 261. 2.54e-23 1 -34.6 75.2 81.5
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_rodent_smol <- ggplot(data = rodent_smol, aes(x = log(LTRL), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,3) +
ylim(0,7) +
theme_classic(base_size = 16)
p_rodent_smol
Have to do so with original version of df because the subclade column was removed for previous code chunks
rodent_muriods <- rodent_with_RTRA %>%
filter(subclade == "Murinae" | subclade == "Neotominae" | subclade == "Arvicolinae" | subclade == "Deomyinae" | subclade == "Gerbillinae" | subclade == "Sigmodontinae" | subclade == "Spalacinae") #specifying subclades within Muroidea
rodent_muriods
## # A tibble: 225 × 8
## Species `MVZ specimen` `mass (g)` `LTRL (mm)` `M1 width (mm)` subclade
## <chr> <dbl> <chr> <dbl> <dbl> <chr>
## 1 Arborimus pomo 99621 24.5 5.85 1.31 Arvicol…
## 2 Arborimus pomo 99622 24.2 5.97 1.33 Arvicol…
## 3 Arborimus pomo 99623 22.3 5.56 1.14 Arvicol…
## 4 Arborimus pomo 99624 26.3 5.48 1.2 Arvicol…
## 5 Arborimus pomo 120595 38.3 6.2 1.49 Arvicol…
## 6 Clethrionomys… 30723 18.6 4.94 0.92 Arvicol…
## 7 Clethrionomys… 30724 20.6 5.06 1.09 Arvicol…
## 8 Clethrionomys… 30725 20 5.13 1.03 Arvicol…
## 9 Clethrionomys… 30726 38.6 5.22 1.04 Arvicol…
## 10 Clethrionomys… 30727 22.5 4.93 0.95 Arvicol…
## # ℹ 215 more rows
## # ℹ 2 more variables: `pub. avg. mass (g)` <chr>, RTRA <dbl>
r_muriod_avg <- rodent_muriods %>%
mutate(Mass = as.numeric(`mass (g)`),
Pub_avg_mass = as.numeric(`pub. avg. mass (g)`),
RTRA = as.numeric(`RTRA`)) %>% #change data for Mass, Published Average Mass, and Retangular Toothrow Area from type chr to dbl
group_by(Species) %>% #grouping data by Species
tidyr::fill(Pub_avg_mass, .direction ="down") %>% #filling in NA's in published averaged masses to make the following code happy
mutate(Mass = coalesce(Mass,Pub_avg_mass)) %>% #Sub NA values for Mass with Previously published mass averages
summarize(
Mass = mean(`Mass`), #averages mass for each species
LTRL = mean(`LTRL (mm)`), #averages toothrow length for each species
`Toothrow_Width` = mean(`M1 width (mm)`), #averages toothrow width (based off of M1 width) for each species
RTRA = mean(RTRA)
) %>%
mutate(across(where(is.numeric), ~ round(., 2))) # rounding data to only have two decimal places
r_muriod_avg
## # A tibble: 43 × 5
## Species Mass LTRL Toothrow_Width RTRA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Aethomys hindei 98.4 6.29 1.84 11.6
## 2 Akodon subfuscus 18.8 4.08 1 4.09
## 3 Apodemus semotus 27 4.32 1.22 5.25
## 4 Arborimus pomo 27.1 5.81 1.29 7.55
## 5 Arvicanthus abysinnicus 82.2 6.24 1.77 11.0
## 6 Berylomys bowersi 426. 9.11 2.49 22.7
## 7 Clethrionomys gappperi 24.1 5.06 1.01 5.09
## 8 Clethrionomys rutilus 27 5.4 1.09 5.89
## 9 Dicrostonyx groenlandicus 65.1 6.94 1.3 9.08
## 10 Graomys griseoflavus 54.4 5.36 1.47 7.9
## # ℹ 33 more rows
Should be the same as the df above
rodent_lt5kg_muroid <- r_muriod_avg %>%
filter(Mass < 5000) #all murine rodents present in this study have a body mass of less than 5kg
rodent_lt5kg_muroid
## # A tibble: 43 × 5
## Species Mass LTRL Toothrow_Width RTRA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Aethomys hindei 98.4 6.29 1.84 11.6
## 2 Akodon subfuscus 18.8 4.08 1 4.09
## 3 Apodemus semotus 27 4.32 1.22 5.25
## 4 Arborimus pomo 27.1 5.81 1.29 7.55
## 5 Arvicanthus abysinnicus 82.2 6.24 1.77 11.0
## 6 Berylomys bowersi 426. 9.11 2.49 22.7
## 7 Clethrionomys gappperi 24.1 5.06 1.01 5.09
## 8 Clethrionomys rutilus 27 5.4 1.09 5.89
## 9 Dicrostonyx groenlandicus 65.1 6.94 1.3 9.08
## 10 Graomys griseoflavus 54.4 5.36 1.47 7.9
## # ℹ 33 more rows
muroid_lt5kg_lm <- lm(log(Mass) ~ log(LTRL), data = rodent_lt5kg_muroid)
tidy(muroid_lt5kg_lm)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -0.679 0.356 -1.91 6.35e- 2
## 2 log(LTRL) 2.73 0.195 14.0 3.10e-17
glance(muroid_lt5kg_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.827 0.823 0.476 197. 3.10e-17 1 -28.0 62.1 67.4
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_muroid_lt5kg <- ggplot(data = rodent_lt5kg_muroid, aes(x = log(LTRL), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,3) +
ylim(0,8) +
theme_classic(base_size = 16)
p_muroid_lt5kg
Should be the same as the df above
rodent_lt500g_muroid <- r_muriod_avg %>%
filter(Mass < 500)
rodent_lt500g_muroid
## # A tibble: 40 × 5
## Species Mass LTRL Toothrow_Width RTRA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Aethomys hindei 98.4 6.29 1.84 11.6
## 2 Akodon subfuscus 18.8 4.08 1 4.09
## 3 Apodemus semotus 27 4.32 1.22 5.25
## 4 Arborimus pomo 27.1 5.81 1.29 7.55
## 5 Arvicanthus abysinnicus 82.2 6.24 1.77 11.0
## 6 Berylomys bowersi 426. 9.11 2.49 22.7
## 7 Clethrionomys gappperi 24.1 5.06 1.01 5.09
## 8 Clethrionomys rutilus 27 5.4 1.09 5.89
## 9 Dicrostonyx groenlandicus 65.1 6.94 1.3 9.08
## 10 Graomys griseoflavus 54.4 5.36 1.47 7.9
## # ℹ 30 more rows
muroid_lt500g_lm <- lm(log(Mass) ~ log(LTRL), data = rodent_lt500g_muroid)
tidy(muroid_lt5kg_lm)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -0.679 0.356 -1.91 6.35e- 2
## 2 log(LTRL) 2.73 0.195 14.0 3.10e-17
glance(muroid_lt5kg_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.827 0.823 0.476 197. 3.10e-17 1 -28.0 62.1 67.4
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_muroid_lt500g <- ggplot(data = rodent_lt500g_muroid, aes(x = log(LTRL), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,3) +
ylim(0,8) +
theme_classic(base_size = 16)
p_muroid_lt500g
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # Subclade
Grouping 2 - Non-muroid Rodents #
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
rodent_non_muriods <- rodent_with_RTRA %>%
filter(!subclade == "Murinae" , !subclade == "Neotominae" , !subclade == "Arvicolinae" , !subclade == "Deomyinae" , !subclade == "Gerbillinae" , !subclade == "Sigmodontinae" , !subclade == "Spalacinae") #removing all subclades belonging to Muroidea
rodent_non_muriods
## # A tibble: 200 × 8
## Species `MVZ specimen` `mass (g)` `LTRL (mm)` `M1 width (mm)` subclade
## <chr> <dbl> <chr> <dbl> <dbl> <chr>
## 1 Aplodontia ru… 172402 1465 17.5 4.13 Aplodon…
## 2 Aplodontia ru… 22624 1375 18.2 3.83 Aplodon…
## 3 Aplodontia ru… 22626 1350 17.9 3.84 Aplodon…
## 4 Aplodontia ru… 22627 1048.2 17.5 3.61 Aplodon…
## 5 Aplodontia ru… 116278 1069 17.6 3.35 Aplodon…
## 6 Aplodontia ru… 96062 832 17.1 4.01 Aplodon…
## 7 Aplodontia ru… 114341 926 16.4 3.99 Aplodon…
## 8 Aplodontia ru… 96058 760.7 17.2 3.82 Aplodon…
## 9 Aplodontia ru… 96063 820.9 16.3 3.68 Aplodon…
## 10 Aplodontia ru… 115540 741 16.8 3.76 Aplodon…
## # ℹ 190 more rows
## # ℹ 2 more variables: `pub. avg. mass (g)` <chr>, RTRA <dbl>
r_non_muriod_avg <- rodent_non_muriods %>%
mutate(Mass = as.numeric(`mass (g)`),
Pub_avg_mass = as.numeric(`pub. avg. mass (g)`),
RTRA = as.numeric(`RTRA`)) %>% #change data for Mass, Published Average Mass, and Retangular Toothrow Area from type chr to dbl
group_by(Species) %>% #grouping data by Species
tidyr::fill(Pub_avg_mass, .direction ="down") %>% #filling in NA's in published averaged masses to make the following code happy
#"The average body mass reported by Ernest (2003) was used for 3 species (Hydrochoerus hydrochaeris, Graphiurus murinus, and Cryptomys hottentotus) for which museum specimens lacked data on body mass." - Copied from Hopkins 2008
mutate(Mass = coalesce(Mass,Pub_avg_mass)) %>% #Sub NA values for Mass with Previously published mass averages
summarize(
Mass = mean(`Mass`), #averages mass for each species
LTRL = mean(`LTRL (mm)`), #averages toothrow length for each species
`Toothrow_Width` = mean(`M1 width (mm)`), #averages toothrow width (based off of M1 width) for each species
RTRA = mean(RTRA)
) %>%
mutate(across(where(is.numeric), ~ round(., 2))) # rounding data to only have two decimal places
r_non_muriod_avg
## # A tibble: 32 × 5
## Species Mass LTRL Toothrow_Width RTRA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Ammospermophilus leucurus 110. 6.31 1.93 12.2
## 2 Aplodontia rufa 1128. 17.4 3.79 66.0
## 3 Castor canadensis 16060 32.7 7.81 256.
## 4 Cryptomys hottentotus 132. 6.08 1.89 11.5
## 5 Ctenomys maulinus 208. 10.7 2.58 27.8
## 6 Cuniculus paca 8860 30.8 7.05 218.
## 7 Cynomys mexicanus 829. 13.7 4.37 59.8
## 8 Dipodomys merriami 39.0 4.47 1.69 7.57
## 9 Erithizon dorsatum 6973. 30.5 6.32 193.
## 10 Galea musteloides 341 12.6 2.51 31.7
## # ℹ 22 more rows
rodent_lt5kg_non_muroid <- r_non_muriod_avg %>%
filter(Mass < 5000)
rodent_lt5kg_non_muroid
## # A tibble: 28 × 5
## Species Mass LTRL Toothrow_Width RTRA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Ammospermophilus leucurus 110. 6.31 1.93 12.2
## 2 Aplodontia rufa 1128. 17.4 3.79 66.0
## 3 Cryptomys hottentotus 132. 6.08 1.89 11.5
## 4 Ctenomys maulinus 208. 10.7 2.58 27.8
## 5 Cynomys mexicanus 829. 13.7 4.37 59.8
## 6 Dipodomys merriami 39.0 4.47 1.69 7.57
## 7 Galea musteloides 341 12.6 2.51 31.7
## 8 Geomys arenarius 158. 7.07 2.07 14.7
## 9 Glaucomys sabrinus 130. 8.09 2.09 16.9
## 10 Graphiurus murinus 24 3.22 1.14 3.68
## # ℹ 18 more rows
non_muroid_lt5kg_lm <- lm(log(Mass) ~ log(LTRL), data = rodent_lt5kg_non_muroid)
tidy(non_muroid_lt5kg_lm)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -0.299 0.262 -1.14 2.64e- 1
## 2 log(LTRL) 2.63 0.124 21.3 5.81e-18
glance(non_muroid_lt5kg_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.946 0.944 0.328 452. 5.81e-18 1 -7.46 20.9 24.9
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_non_muroid_lt5kg <- ggplot(data = rodent_lt5kg_non_muroid, aes(x = log(LTRL), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,4) +
ylim(0,9) +
theme_classic(base_size = 16)
p_non_muroid_lt5kg
rodent_lt500g_non_muroid <- r_non_muriod_avg %>%
filter(Mass < 500)
rodent_lt500g_non_muroid
## # A tibble: 21 × 5
## Species Mass LTRL Toothrow_Width RTRA
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Ammospermophilus leucurus 110. 6.31 1.93 12.2
## 2 Cryptomys hottentotus 132. 6.08 1.89 11.5
## 3 Ctenomys maulinus 208. 10.7 2.58 27.8
## 4 Dipodomys merriami 39.0 4.47 1.69 7.57
## 5 Galea musteloides 341 12.6 2.51 31.7
## 6 Geomys arenarius 158. 7.07 2.07 14.7
## 7 Glaucomys sabrinus 130. 8.09 2.09 16.9
## 8 Graphiurus murinus 24 3.22 1.14 3.68
## 9 Liomys pictus 33.5 4.54 1.54 6.98
## 10 Mesomys hispidus 157. 7.8 1.87 14.6
## # ℹ 11 more rows
non_muroid_lt500g_lm <- lm(log(Mass) ~ log(LTRL), data = rodent_lt500g_non_muroid)
tidy(non_muroid_lt5kg_lm)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -0.299 0.262 -1.14 2.64e- 1
## 2 log(LTRL) 2.63 0.124 21.3 5.81e-18
glance(non_muroid_lt5kg_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.946 0.944 0.328 452. 5.81e-18 1 -7.46 20.9 24.9
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_non_muroid_lt500g <- ggplot(data = rodent_lt500g_non_muroid, aes(x = log(LTRL), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,4) +
ylim(0,9) +
theme_classic(base_size = 16)
p_non_muroid_lt500g
#Making a nice table of all Stats using sjStats package
tab_model(interspecies_lm, interspecies_small5kg_lm, interspecies_smol_lm, muroid_lt5kg_lm, non_muroid_lt5kg_lm, muroid_lt500g_lm,non_muroid_lt500g_lm,
show.intercept = TRUE, show.est = TRUE, show.ci = 0.95, show.se = TRUE, show.p = TRUE, show.r2 = TRUE, show.re.var = TRUE,
dv.labels = c("All Species", "All <5kg", "All <500g", "Muroidea <5kg", "Non-Muroidea <5kg", "Muroidea <500g", "Non-Muroidea <500g"))
| Â | All Species | All <5kg | All <500g | Muroidea <5kg | Non-Muroidea <5kg | Muroidea <500g | Non-Muroidea <500g | |||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Predictors | Estimates | std. Error | CI | p | Estimates | std. Error | CI | p | Estimates | std. Error | CI | p | Estimates | std. Error | CI | p | Estimates | std. Error | CI | p | Estimates | std. Error | CI | p | Estimates | std. Error | CI | p |
| (Intercept) | -0.71 | 0.17 | -1.05 – -0.37 | <0.001 | -0.62 | 0.22 | -1.06 – -0.17 | 0.007 | -0.29 | 0.28 | -0.86 – 0.28 | 0.311 | -0.68 | 0.36 | -1.40 – 0.04 | 0.064 | -0.30 | 0.26 | -0.84 – 0.24 | 0.264 | -0.57 | 0.45 | -1.47 – 0.34 | 0.212 | 0.19 | 0.28 | -0.39 – 0.76 | 0.508 |
| LTRL [log] | 2.79 | 0.08 | 2.62 – 2.95 | <0.001 | 2.73 | 0.11 | 2.51 – 2.96 | <0.001 | 2.53 | 0.16 | 2.22 – 2.84 | <0.001 | 2.73 | 0.19 | 2.34 – 3.13 | <0.001 | 2.63 | 0.12 | 2.37 – 2.88 | <0.001 | 2.66 | 0.25 | 2.15 – 3.18 | <0.001 | 2.32 | 0.14 | 2.02 – 2.62 | <0.001 |
| Observations | 75 | 71 | 61 | 43 | 28 | 40 | 21 | |||||||||||||||||||||
| R2 / R2 adjusted | 0.939 / 0.939 | 0.894 / 0.892 | 0.815 / 0.812 | 0.827 / 0.823 | 0.946 / 0.944 | 0.743 / 0.736 | 0.932 / 0.928 | |||||||||||||||||||||
Comparing my results with Table 2 from the paper, you can see that the values are quite similar for all datasets and statistical parameters. One thing I do want to note is the difference in the d.f for Non-Muroidea (both mass groupings). I checked the data multiple times and I think this must have been a small error in the publication because both the dataset provided in the publication and all of my data loaded into R has corresponded to the numbers provided in my analysis. This change appears to have minimal to no impact on my results but is still worth noting as it is a distinct difference.
The author of this paper did not end up providing confidence interval values but provided the equations (pictured above) and a table of the values for each variable across all analysis subgroups for both LTRL and RTRA (Table 5 - above). I attempted to determine how she calculated some of these variables but was unable to determine how she calculated the standard error of the regression (and how this is different from the standard error of the regression slope and intercept) and am honestly not sure how to proceed. I did calculate the confidence intervals but am unable to compare with Dr.Hopkins and determine how accurate my calculations are.
detach("package:sjPlot", unload=TRUE) # detaching sjPlot package to remove conflicts with plot_grid function below
bottom_square <- plot_grid(p_no_outlier, p_rodent_smol, p_non_muroid_lt5kg, p_muroid_lt5kg,
labels = c("B. <5kg", "C. <500g", "D. non-muroids <5kg", "E. muroids <5kg"),
label_size = 16,
hjust = 0,
label_x = 0.15)
plot_grid(p_All, bottom_square, labels = c("A. all species"), label_size = 16, ncol = 1, hjust = 0,
label_x = 0.1, rel_heights = c(1, 1))
Comparing my analyses with Figure 2 from Hopkins 2008 you can see that my plots are quite similar if not exactly the same!! Across all groupings there is a significant and clear linear relationship between lower toothrow lenght and body mass. Woohoo for data replication being successful!!
interspecies_rtra_lm <- lm(log(Mass) ~ log(RTRA), data = rodent_avg)
tidy(interspecies_rtra_lm)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.923 0.0931 9.92 3.64e-15
## 2 log(RTRA) 1.50 0.0328 45.7 1.92e-55
glance(interspecies_rtra_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.966 0.966 0.318 2086. 1.92e-55 1 -19.5 45.1 52.0
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_All_RTRA <- ggplot(data = rodent_avg, aes(x = log(RTRA), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,7) +
ylim(0,12) +
theme_classic(base_size = 16)
p_All_RTRA
interspecies_small5kg_rtra_lm <- lm(log(Mass) ~ log(RTRA), data = rodent_no_outlier)
tidy(interspecies_small5kg_rtra_lm)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 1.03 0.113 9.14 1.65e-13
## 2 log(RTRA) 1.45 0.0437 33.2 3.93e-44
glance(interspecies_small5kg_rtra_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.941 0.940 0.319 1100. 3.93e-44 1 -18.5 43.1 49.8
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_interspecies_small5kg_rtra <- ggplot(data = rodent_no_outlier, aes(x = log(RTRA), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,5) +
ylim(0,9) +
theme_classic(base_size = 16)
p_interspecies_small5kg_rtra
interspecies_lt500g_rtra_lm <- lm(log(Mass) ~ log(RTRA), data = rodent_smol)
tidy(interspecies_lt500g_rtra_lm)
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 1.05 0.148 7.12 1.72e- 9
## 2 log(RTRA) 1.44 0.0649 22.1 2.81e-30
glance(interspecies_lt500g_rtra_lm)
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.893 0.891 0.331 491. 2.81e-30 1 -18.1 42.2 48.5
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
p_interspecies_lt500_rtra <- ggplot(data = rodent_smol, aes(x = log(RTRA), y = log(Mass))) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ x) +
xlim(0,4) +
ylim(0,7) +
theme_classic(base_size = 16)
p_interspecies_lt500_rtra
library(sjPlot)
tab_model(interspecies_rtra_lm, interspecies_small5kg_rtra_lm, interspecies_lt500g_rtra_lm,
show.intercept = TRUE, show.est = TRUE, show.ci = 0.95, show.se = TRUE, show.p = TRUE, show.r2 = TRUE, show.re.var = TRUE,
dv.labels = c("All Species", "All <5kg", "All <500g"))
| Â | All Species | All <5kg | All <500g | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Predictors | Estimates | std. Error | CI | p | Estimates | std. Error | CI | p | Estimates | std. Error | CI | p |
| (Intercept) | 0.92 | 0.09 | 0.74 – 1.11 | <0.001 | 1.03 | 0.11 | 0.81 – 1.26 | <0.001 | 1.05 | 0.15 | 0.76 – 1.35 | <0.001 |
| RTRA [log] | 1.50 | 0.03 | 1.43 – 1.56 | <0.001 | 1.45 | 0.04 | 1.36 – 1.54 | <0.001 | 1.44 | 0.06 | 1.31 – 1.57 | <0.001 |
| Observations | 75 | 71 | 61 | |||||||||
| R2 / R2 adjusted | 0.966 / 0.966 | 0.941 / 0.940 | 0.893 / 0.891 | |||||||||
detach("package:sjPlot", unload=TRUE)
Again comparing all statistical paramters
you can see that between my replication of analyses and Table 4 (above)
that my results returned very similar values and show that RTRA is in
fact a significant predictor of Body Mass as well!